Structural analysis of a chromatin model

Structural models generated in previous tutorials may also be analyzed in more depth. This tutorial describes some classical methods used by structural biologists that are included in TADbit.

These methods are mainly designed to be used on single models, so we start this tutorial by loading a single structural model:

from pytadbit.imp.impmodel import load_impmodel_from_cmm, load_impmodel_from_xyz

model = load_impmodel_from_cmm('./model.3261.cmm')

General shape of the model

This section describes some methods to roughly characterize the three-dimensional occupancy of a model.

Finding the center of mass and radius of gyration

A quick way to understand how dense or compact a model is consists in calculating its radius of gyration (see pytadbit.imp.impmodel.IMPmodel.radius_of_gyration()) and its center of mass:

print model.center_of_mass()
print model.radius_of_gyration()
{'y': -11976.696977752474, 'x': 1459.4080388960351, 'z': -5505.949389580002}
2107.38031986
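To make the two quantities concrete, here is a minimal sketch of how they can be computed from raw coordinates. It assumes that every particle has the same mass and that the radius of gyration is the root-mean-square distance of the particles to the center of mass; the function names and the toy coordinates are hypothetical, and TADbit's own implementation may differ in its details.

```python
from math import sqrt

def center_of_mass(xs, ys, zs):
    """Mean position of the particles, assuming they all have equal mass."""
    n = float(len(xs))
    return {'x': sum(xs) / n, 'y': sum(ys) / n, 'z': sum(zs) / n}

def radius_of_gyration(xs, ys, zs):
    """Root-mean-square distance of the particles to their center of mass."""
    com = center_of_mass(xs, ys, zs)
    squared = sum((x - com['x']) ** 2 + (y - com['y']) ** 2 + (z - com['z']) ** 2
                  for x, y, z in zip(xs, ys, zs))
    return sqrt(squared / len(xs))

# Hypothetical toy model (nm): four particles on the corners of a square
xs = [0.0, 100.0, 100.0, 0.0]
ys = [0.0, 0.0, 100.0, 100.0]
zs = [0.0, 0.0, 0.0, 0.0]

print(center_of_mass(xs, ys, zs))
print(radius_of_gyration(xs, ys, zs))
```

In this toy case every particle lies at the same distance from the center of the square, so the radius of gyration is simply that distance. A smaller value for the same number of particles indicates a more compact model.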

As an extra feature, the radius of gyration (or gyradius) can also be displayed within Chimera:

model.view_model(tool='chimera_nogui', savefig='/tmp/image_model_2.png', centroid=True, gyradius=True)
model.view_model(tool='chimera_nogui', centroid=True, gyradius=True, savefig='/tmp/image_model_2.webm')
[Image: the model displayed in Chimera with its centroid and radius of gyration]

Model length

print model.contour()
100002.254636

The modeled chromatin strand is thus about 100002 nm long.
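As an illustration of what such a length represents, the sketch below sums the distances between consecutive particles. This is an assumption about how a contour length can be defined, not a copy of TADbit's code; the function name and the coordinates are hypothetical.

```python
from math import sqrt

def contour_length(xs, ys, zs):
    """Sum of the Euclidean distances between consecutive particles."""
    total = 0.0
    for i in range(len(xs) - 1):
        total += sqrt((xs[i + 1] - xs[i]) ** 2 +
                      (ys[i + 1] - ys[i]) ** 2 +
                      (zs[i + 1] - zs[i]) ** 2)
    return total

# Hypothetical toy model (nm): two segments of 5 nm and 12 nm
xs = [0.0, 3.0, 3.0]
ys = [0.0, 4.0, 4.0]
zs = [0.0, 0.0, 12.0]

print(contour_length(xs, ys, zs))
```

Under this assumption, the 100 particles of the model above would be separated by about 1010 nm on average (100002 nm over 99 segments).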

Fitting model into a cube

Find the longest and shortest distances between two particles:

print model.longest_axe()
print model.shortest_axe()
7033.67649175
374.538753758

Characterize a cube that includes the model:

print model.cube_side()
print model.cube_volume()
8861.41178682
6.95838983082e+11
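The sketch below shows one simple way to enclose a set of coordinates in a cube, taking the side as the largest span along any axis. This convention is an assumption made for illustration: the side printed above (about 8861 nm) is larger than the longest distance printed earlier (about 7034 nm), so TADbit evidently uses a more generous definition. The last lines only check that the two printed values are consistent with each other, the volume being the side cubed. All names and toy coordinates are hypothetical.

```python
def axis_extents(xs, ys, zs):
    """Span of the model along each axis (max minus min coordinate)."""
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

def enclosing_cube(xs, ys, zs):
    """Side and volume of an axis-aligned cube wide enough for every axis.

    Assumption: the side is the largest per-axis span, which is not
    necessarily the definition used by TADbit.
    """
    side = max(axis_extents(xs, ys, zs))
    return side, side ** 3

# Hypothetical toy coordinates (nm)
xs = [0.0, 400.0, 250.0]
ys = [0.0, 100.0, -50.0]
zs = [10.0, 20.0, 30.0]
toy_side, toy_volume = enclosing_cube(xs, ys, zs)
print(toy_side)

# Consistency check on the values printed above: volume versus side cubed
printed_side = 8861.41178682
printed_volume = 6.95838983082e+11
relative_gap = abs(printed_side ** 3 - printed_volume) / printed_volume
print(relative_gap < 1e-6)
```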

Chromatin accessibility: fitting objects inside the model

In order to infer which part of the modeled chromatin can be accessed by an object, like the transcription machinery, TADbit calculates a mesh around the model, and checks for each point of this mesh if an object of a given size can fit.

Here is an example revealing the surface of a chromatin strand accessible to a hypothetical protein 400 nanometers in diameter (radius of 200 nanometers):

acc_dots, tot_dots, acc_area, tot_area, acc_vs_inacc = model.accessible_surface(
                200, nump=100, verbose=True, write_cmm_file='./model_mesh.cmm',
                savefig='/tmp/model_mesh.webm', chimera_bin='chimera_nogui')
Accessible surface: 90.34 micrometers^2(17972 accessible times 0.00502654824574 micrometers)
   (17972 accessible dots of 22080 total times 0.00503 micrometers)
 - 81.39% of the contour mesh
 - 71.6% of a virtual straight chromatin (126.17 microm^2)

The function above provides a considerable amount of information.

  • The text printed (when verbose=True) corresponds to some general statistics about the accessibility of the chromatin.
    • In this example, 81% of the chromatin is accessible to the hypothetical protein. This number includes not only the particles, but also the edges linking them (remember that a particle represents a given locus of DNA). The second percentage printed corresponds to the percentage of accessible chromatin without taking its folding into consideration (that is, considering a straight strand of chromatin).
    • As stated above, in order to infer the proportion of accessible chromatin, a mesh is drawn around the chromatin strand. This mesh represents all possible positions of the hypothetical protein. The surface values are relative to this mesh, not to the real accessible surface of the chromatin. However, the numbers are proportional, and the percentages are conserved.
    • The dots mentioned in the output represent the mesh, and their number is proportional to the nump parameter. Accessibility is measured using these dots: if a dot is distant enough from every point of the chromatin strand, then it is considered accessible; if some part of the chromatin lies closer to a dot than the radius of the hypothetical protein, this dot is considered inaccessible, as the protein could not fit in its place. See the movie below (generated using the savefig parameter) for a better understanding: dots are displayed in green when they represent a possible placement of the hypothetical protein, and in red when the protein would not fit.
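The accessibility test described in the last point can be sketched in a few lines. This is a simplified illustration, not TADbit's implementation: the chromatin is reduced to a list of points (the real mesh also accounts for the edges), and the function names, coordinates, and the choice of treating a distance exactly equal to the radius as accessible are all assumptions.

```python
from math import sqrt

def distance(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify_dots(dots, chromatin_points, radius):
    """Split mesh dots into accessible ("green") and inaccessible ("red").

    A dot is accessible when no chromatin point lies closer than `radius`,
    meaning that a probe of that radius centered on the dot would fit.
    """
    accessible, inaccessible = [], []
    for dot in dots:
        if all(distance(dot, point) >= radius for point in chromatin_points):
            accessible.append(dot)
        else:
            inaccessible.append(dot)
    return accessible, inaccessible

# Hypothetical toy data (nm): two chromatin points and three mesh dots
chromatin_points = [(0.0, 0.0, 0.0), (300.0, 0.0, 0.0)]
dots = [(0.0, 200.0, 0.0),    # 200 nm from the first point: the probe fits
        (150.0, 0.0, 0.0),    # 150 nm from both points: the probe does not fit
        (0.0, -200.0, 0.0)]   # 200 nm from the first point: the probe fits

acc, ina = classify_dots(dots, chromatin_points, 200.0)
print('%d accessible, %d inaccessible' % (len(acc), len(ina)))
```

The ratio of accessible dots to total dots then gives the kind of percentage reported for the contour mesh.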

In order to measure how “buried” each particle is, the function returns a list of values (which, in the example above, we store in the acc_vs_inacc variable). This list contains, for each particle, the number of “green dots” and the number of “red dots”. A useful value derived from it is the buried percentage of each particle (100% means that the particle is completely inaccessible to the given protein).

Continuing with the example, these numbers can be obtained from the acc_vs_inacc list:

for i, (acc, ina) in enumerate(acc_vs_inacc):
    print 'particle %3s: %4.1f%% buried'%(i+1, float(ina)/(acc+ina)*100),
    print ('|' if (i+1)%4 else '\n'),
particle   1:  0.0% buried | particle   2:  0.0% buried | particle   3:  0.0% buried | particle   4:  0.0% buried
particle   5:  0.0% buried | particle   6:  0.0% buried | particle   7:  2.3% buried | particle   8: 39.3% buried
particle   9:  0.0% buried | particle  10:  3.4% buried | particle  11:  0.0% buried | particle  12:  0.0% buried
particle  13:  0.0% buried | particle  14:  0.0% buried | particle  15:  0.0% buried | particle  16:  0.0% buried
particle  17: 19.0% buried | particle  18:  0.0% buried | particle  19:  0.0% buried | particle  20:  0.0% buried
particle  21:  0.0% buried | particle  22:  0.0% buried | particle  23:  0.0% buried | particle  24:  0.0% buried
particle  25:  0.0% buried | particle  26:  6.7% buried | particle  27: 41.7% buried | particle  28: 24.5% buried
particle  29:  0.0% buried | particle  30:  0.0% buried | particle  31:  0.0% buried | particle  32: 18.8% buried
particle  33:  0.0% buried | particle  34: 69.6% buried | particle  35:  2.1% buried | particle  36:  0.0% buried
particle  37:  2.0% buried | particle  38:  0.0% buried | particle  39:  0.0% buried | particle  40:  0.0% buried
particle  41:  0.0% buried | particle  42:  5.7% buried | particle  43:  0.0% buried | particle  44: 66.7% buried
particle  45:  0.0% buried | particle  46:  0.0% buried | particle  47:  4.2% buried | particle  48: 57.1% buried
particle  49: 47.3% buried | particle  50:  3.7% buried | particle  51: 44.8% buried | particle  52:  5.7% buried
particle  53:  0.0% buried | particle  54:  5.0% buried | particle  55:  5.6% buried | particle  56: 38.1% buried
particle  57:  0.0% buried | particle  58:  0.0% buried | particle  59: 32.6% buried | particle  60:  0.0% buried
particle  61:  8.1% buried | particle  62:  0.0% buried | particle  63:  0.0% buried | particle  64: 26.5% buried
particle  65:  0.0% buried | particle  66:  0.0% buried | particle  67: 11.9% buried | particle  68:  0.0% buried
particle  69:  1.9% buried | particle  70:  4.5% buried | particle  71:  0.0% buried | particle  72:  2.6% buried
particle  73:  0.0% buried | particle  74: 17.1% buried | particle  75:  0.0% buried | particle  76:  4.3% buried
particle  77:  0.0% buried | particle  78:  7.3% buried | particle  79: 43.3% buried | particle  80:  0.0% buried
particle  81:  0.0% buried | particle  82:  4.9% buried | particle  83:  0.0% buried | particle  84:  0.0% buried
particle  85:  0.0% buried | particle  86:  0.0% buried | particle  87:  0.0% buried | particle  88:  0.0% buried
particle  89:  0.0% buried | particle  90:  5.4% buried | particle  91: 25.0% buried | particle  92:  0.0% buried
particle  93:  0.0% buried | particle  94:  0.0% buried | particle  95:  5.6% buried | particle  96:  0.0% buried
particle  97:  0.0% buried | particle  98:  0.0% buried | particle  99:  2.0% buried | particle 100:  0.0% buried

Note that in this example no particle is 100% buried.
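When the list is long, it can be more convenient to compute the percentages once and rank the particles. The sketch below does so on hypothetical counts laid out like acc_vs_inacc (pairs of accessible and inaccessible dots); the helper name is not part of TADbit, and reporting 0% for a particle without any dot is an arbitrary choice made here to avoid a division by zero.

```python
def buried_percentages(acc_vs_inacc):
    """Percentage of inaccessible dots for each (accessible, inaccessible) pair."""
    result = []
    for acc, ina in acc_vs_inacc:
        total = acc + ina
        # Arbitrary convention: a particle with no dots is reported as 0% buried
        result.append(100.0 * ina / total if total else 0.0)
    return result

# Hypothetical counts for four particles
counts = [(50, 0), (30, 20), (0, 40), (0, 0)]
buried = buried_percentages(counts)

# Rank particles (numbered from 1) from most to least buried
ranked = sorted(enumerate(buried, 1), key=lambda item: item[1], reverse=True)
for particle, percent in ranked[:2]:
    print('particle %3d: %5.1f%% buried' % (particle, percent))

# Check whether any particle is completely inaccessible
print(any(percent == 100.0 for percent in buried))
```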

In order to visualize what this result really means, the mesh can be displayed only around particles, by setting the option include_edges to False. In this case, the global accessibility value of the chromatin will change, but the individual statistics of the particles will be kept.

In the movie generated by the call below (through the savefig parameter), only the relevant part of the mesh for each particle is shown. Note that only a part of the sphere surrounding each particle is displayed, as nearby edges prevent the protein from approaching the given particle. For more details on how the mesh is built, refer to the function documentation: pytadbit.imp.impmodel.IMPmodel.accessible_surface()

acc_dots, tot_dots, acc_area, tot_area, acc_vs_inacc = model.accessible_surface(
                200, nump=100, verbose=True, include_edges=False, write_cmm_file='./model_partmesh.cmm',
                savefig='/tmp/model_partmesh.webm', chimera_bin='chimera_nogui')
Accessible surface: 19.46 micrometers^2(3872 accessible times 0.00502654824574 micrometers)
   (3872 accessible dots of 4125 total times 0.00503 micrometers)
 - 93.87% of the contour mesh
 - 15.43% of a virtual straight chromatin (126.17 microm^2)
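The figures printed by the two calls can be cross-checked with a little arithmetic. The sketch below is based on an observation about the numbers rather than on TADbit's source: the per-dot area in the output (0.00502654824574) coincides with the surface of a sphere of radius 0.2 micrometers divided by nump=100. The function name is hypothetical, and the straight-chromatin area (126.17) is copied from the printed output.

```python
from math import pi

radius_um = 0.2   # probe radius of 200 nm, expressed in micrometers
nump = 100        # value passed to accessible_surface in both calls
dot_area = 4 * pi * radius_um ** 2 / nump   # about 0.0050265 square micrometers

def summarize(accessible, total, straight_area=126.17):
    """Accessible area, share of the mesh, and share of a straight strand."""
    area = accessible * dot_area
    return area, 100.0 * accessible / total, 100.0 * area / straight_area

# Dot counts printed by the two calls (with and without edges)
for accessible, total in [(17972, 22080), (3872, 4125)]:
    print('%.2f um^2, %.2f%% of mesh, %.2f%% of straight chromatin'
          % summarize(accessible, total))
```

The values obtained this way match the printed ones up to rounding: 90.34 and 19.46 square micrometers, 81.39% and 93.87% of the mesh, and 71.6% and 15.43% of the straight strand.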